Interpretable machine learning for the prediction of death risk in patients with acute diquat poisoning

The aim of this study was to develop and validate predictive models for assessing the risk of death in patients with acute diquat (DQ) poisoning using innovative machine learning techniques. Additionally, predictive models were evaluated through the application of SHapley Additive ExPlanations (SHAP). A total of 201 consecutive patients from the emergency departments of the First Hospital and Shengjing Hospital of China Medical University admitted for deliberate oral intake of DQ from February 2018 to August 2023 were analysed. The initial clinical data of the patients with acute DQ poisoning were collected. Machine learning methods such as logistic regression, random forest, support vector machine (SVM), and gradient boosting were applied to build the prediction models. The whole sample was split into a training set and a test set at a ratio of 8:2. The performances of these models were assessed in terms of discrimination, calibration, and clinical decision curve analysis (DCA). We also used the SHAP interpretation tool to provide an intuitive explanation of the risk of death in patients with DQ poisoning. Logistic regression, random forest, SVM, and gradient boosting models were established, and the areas under the receiver operating characteristic curves (AUCs) were 0.91, 0.98, 0.96 and 0.94, respectively. The net benefits were similar across all four models. The four machine learning models can be reliable tools for predicting death risk in patients with acute DQ poisoning. Their combination with SHAP provides explanations for individualized risk prediction, increasing the model transparency.


Model performance comparisons
We generated four machine learning models to predict the risk of death in patients with acute DQ poisoning. The discriminative performance of the four models was assessed using ROC curves. Among the four models, the random forest model (AUC = 0.98) had the best predictive performance for death risk in patients with acute DQ poisoning, followed by the SVM model (AUC = 0.96), the gradient boosting model (AUC = 0.94) and the logistic regression model (AUC = 0.91), as shown in Fig. 1A. The performance of the four models is shown in Table 2, with the random forest model achieving the highest F1-score (0.90), the highest MCC (0.79), the highest accuracy (0.90), and the lowest Brier score (0.07). The calibration curves are shown in Fig. 1B. With the exception of gradient boosting, whose Hosmer-Lemeshow χ² value was 27.84 (p < 0.001), indicating poor calibration, the remaining three models all demonstrated good calibration, as shown in Table 2. (A significant test statistic implies that the model does not calibrate perfectly [16].) According to the DCA curves, all models provided decent net benefits, as shown in Fig. 1C, with similar net benefits at the 5% decision threshold, as shown in Table 2.

Table 1. Baseline characteristics of the surviving and nonsurviving patients. WBC, white blood cell; Hb, haemoglobin; PLT, platelet; ALT, alanine aminotransferase; TBil, total bilirubin; DBil, direct bilirubin; ALB, albumin; K+, potassium; BUN, blood urea nitrogen; Cr, creatinine; Glu, glucose; TnI, troponin I; BNP, brain natriuretic peptide; PaO2, partial pressure of oxygen; PaCO2, partial pressure of carbon dioxide; DQd, diquat dose.
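The threshold-based metrics reported in Table 2 (F1-score, accuracy, MCC) and the Brier score can be computed from the predicted probabilities and the observed outcomes. The following is a minimal sketch, not the authors' code: the 0.5 classification threshold and the example values are assumptions made for illustration, since the text does not state the threshold that was used.

```python
import math

def classification_metrics(y_true, y_prob, threshold=0.5):
    """Return accuracy, precision, recall, F1, MCC and Brier score.

    y_true holds 0 (survival) or 1 (nonsurvival); y_prob holds predicted
    probabilities of death. The 0.5 threshold is an assumed default.
    """
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(y_true)

    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    # Brier score: mean squared difference between probability and outcome.
    brier = sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f1": f1, "mcc": mcc, "brier": brier}

# Invented outcomes and probabilities, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.8, 0.4, 0.2, 0.1, 0.6, 0.3, 0.7]
metrics = classification_metrics(y_true, y_prob)
```

A lower Brier score indicates better-calibrated and sharper probabilities, which is why it is read in the opposite direction to the other metrics.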

Feature importance and model interpretation
SHAP values were calculated to assess the importance of each feature. This process requires sequentially integrating features, starting with the most important feature and gradually adding the next feature in order of importance [17].
In each of the four models, the contribution of each feature was essentially equal for survival (class = 0) and nonsurvival (class = 1) (Fig. 2). Figure 3 shows the bar graphs of the predictions for nonsurviving and surviving patients. F(x) is the log odds ratio for each observation. The arrows indicate the impact of each factor on the prediction: red arrows indicate that the factor increased the risk of death, and blue arrows indicate that it decreased the risk. The longer the arrow is, the greater the effect.
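The additive reading of these plots, in which each arrow is one feature's contribution and the contributions sum to the model output, can be illustrated with exact Shapley values for a small model. This is a minimal sketch under stated assumptions: the brute-force enumeration, the single reference patient, the `log_odds` function and its coefficients are all hypothetical and are not the authors' models or the approximation algorithms used by the SHAP library.

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x relative to one baseline point.

    Features outside a coalition are set to their baseline value.
    The cost grows exponentially with the number of features.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return f(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

# Hypothetical log-odds model on three scaled features (Cr, lactic acid,
# PaCO2); the coefficients are invented and not taken from the study.
def log_odds(z):
    cr, lactate, paco2 = z
    return -2.0 + 3.0 * cr + 2.0 * lactate - 1.5 * paco2 + 1.0 * cr * lactate

baseline = [0.2, 0.2, 0.6]  # hypothetical reference patient
patient = [0.9, 0.7, 0.3]   # hypothetical patient being explained
phi = shapley_values(log_odds, patient, baseline)
```

Here every entry of `phi` is positive, which would correspond to three red arrows, and the entries sum to the difference between the patient's output and the baseline output. The lower PaCO2 value pushes the output up because its coefficient in this toy model is negative.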

Discussion
This study effectively predicted the risk of death in patients with acute DQ poisoning using interpretable machine learning methods and common clinical indicators. As the use of paraquat (PQ) has gradually decreased, the incidence of poisoning by its substitute herbicide, DQ, has gradually increased. Currently, there is no specific antidote for DQ poisoning; therefore, the death rate of poisoned patients is high [4]. Clinicians face great challenges in both the assessment and clinical treatment of such poisoning. Therefore, a simple and intuitive assessment method is highly important for quickly identifying the risk of death in critically ill patients with acute DQ intoxication.

DQ is a potent redox cycler that is readily converted to a free radical, which, when reacted with molecular oxygen, generates superoxide anions and, subsequently, other redox products. These products can induce lipid peroxidation in cell membranes and potentially lead to cell death [18]. When DQ enters the body, it is reduced by receiving a single electron from NADPH, which is the primary source of reducing equivalents in cells, forming NADP+ and a highly unstable DQ+•. In turn, DQ+• transfers an electron to molecular oxygen (O2) to generate the superoxide anion (O2•−). DQ+• thereby reverts to its initial state and can undergo this process continuously, generating large quantities of O2•−. This O2•− is subsequently neutralized spontaneously or through superoxide dismutase (SOD) activity, resulting in the formation of hydrogen peroxide (H2O2) and O2 [19]. Under normal circumstances, H2O2 is converted to water through the action of catalase and glutathione peroxidase. However, in the presence of a substantial increase in reactive oxygen species production, the defence mechanisms within the cell, such as nonenzymatic constituents or antioxidant enzymes, are overburdened, leading to oxidative stress. Consequently, cellular dysfunction and injury occur [20-22].

DQ is believed to cause significant hepatic and renal toxicity through the involvement of free radicals [21]. This compound specifically induces damage to the kidney by affecting its excretory function, leading to conditions such as oliguria, anuria, proteinuria, haematuria, pyuria, azotemia, acute renal failure, and acute tubular necrosis [23,24]. In this study, consistent with previous findings, renal impairment was found to be a risk factor for death in patients with acute DQ poisoning. DQ can also damage the liver, central nervous system, and lungs, among other organs, and local damage to the reproductive system and the skin has also been reported [3,4,25]. Dyspnoea, pulmonary oedema, and respiratory depression are manifestations of pulmonary injury. However, unlike for PQ poisoning, there are no reports of pulmonary fibrosis caused by DQ poisoning [26,27] [...] alveolar epithelial cells [28]. Currently, there are no known remedies or successful treatments for DQ poisoning, and the focus of treatment has been on minimizing absorption and/or improving elimination [18,29].

This study is the first to apply machine learning to predict the risk of death from acute DQ poisoning. Machine learning models are widely used in clinical diagnostics, precision treatments, and health monitoring and have achieved good results [30,31]. Each model has its own advantages and disadvantages. For example, random forest makes fewer assumptions about the predictor variables than traditional modelling strategies and overfits less than simple classification and regression trees. However, the random forest model has the fundamental issue of being a black box model: it cannot describe relationships within the data, so when an alarm sounds, medical staff are unsure of what immediate action to take until the patient is checked [32]. In this study, we employed machine learning combined with SHAP to assess the risk of death in patients with acute DQ poisoning. Previous studies primarily relied on logistic regression analysis and have not yet explored the application of machine learning. Consequently, there remains a dearth of evidence regarding the benefits of machine learning in predicting the risk of death in patients with DQ poisoning. Our results demonstrate that all four models exhibit strong performance, with random forest surpassing traditional logistic regression analysis in terms of discrimination, as indicated by the ROC curves.

We further plotted the feature importance of the random forest model. The results revealed that Cr, PaCO2, DQd, lactic acid, and WBC were important features for predicting death in patients with acute DQ poisoning. Higher levels of Cr, lactic acid, oral dosage of DQ, and WBC were associated with an increased risk of death, while lower levels of PaCO2 were also correlated with a greater risk of death. Most poisoning cases are related to the intentional ingestion of concentrated liquid formulations. In this study, the results showed a direct relationship between DQ intake and patient death. As the oral dose of DQ increased, the death rate of patients increased significantly, consistent with previous studies [33] showing that rapid ingestion of more than 15 mL of a 20% concentrated formulation of DQ is usually fatal.
The results of this study showed that the higher the lactic acid concentration, the greater the risk of death. The prognostic ability of arterial lactate levels has been assessed in various critical care patient groups, including those with septic shock, circulatory shock, recent surgical procedures, burns, and trauma. The level of lactic acid has emerged as a reliable predictor of mortality in individuals with severe illness [34]. In previous studies on the prognosis of acute PQ poisoning, clinical cases from different countries have shown that lactic acid is a good predictive factor [35,36].
In this study, a lower PaCO2 suggested a greater risk of death. In one study, a decrease in PaCO2 caused cerebral vasoconstriction, with a 1 mmHg decrease in PaCO2 corresponding to a decrease in cerebral blood flow of 1.8 mL/100 g/min [37]. According to our results, the WBC count was associated with poor outcomes. Many toxic diseases, such as acute organophosphate insecticide poisoning (AOPP), increase the WBC count, making it an indicator of poor prognosis [38]. In previous studies of patients with acute PQ poisoning, an elevated WBC count was one of the indicators of poor prognosis [39].
The random forest model had the highest F1-score, accuracy, AUC, and MCC and the lowest Brier score. Compared with the other models, its overall performance was slightly better. DCA demonstrated that the four models provided a good net benefit within a range of thresholds (Fig. 1C). Overall, all four models demonstrated good predictive performance, with random forest performing slightly better.
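The net benefit plotted in a decision curve can be computed at any decision threshold from the true-positive and false-positive counts, with false positives weighted by the odds of the threshold. The sketch below uses the standard net benefit formula; the outcomes and probabilities are invented for illustration, and the 5% threshold is taken from the text.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for t, p in zip(y_true, y_prob) if p >= threshold and t == 1)
    fp = sum(1 for t, p in zip(y_true, y_prob) if p >= threshold and t == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1 - threshold)

def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Invented outcomes (1 = death) and predicted probabilities.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.6, 0.3, 0.02, 0.01, 0.04, 0.03, 0.2, 0.01, 0.02]

nb_model = net_benefit(y_true, y_prob, threshold=0.05)
nb_all = net_benefit_treat_all(y_true, threshold=0.05)
nb_none = 0.0  # treating nobody yields zero net benefit by definition
```

A model is useful at a given threshold when its net benefit exceeds both reference strategies, as it does in this toy example.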
The SHAP method used in this study lists the features in order of importance, from most important (top) to least important (bottom). Each feature contributed equally to the prediction of nonsurvival and survival, but the feature weights differed between the models (Fig. 2). We provided two examples to illustrate the interpretability of the model, one for a nonsurviving patient and one for a surviving patient (Fig. 3). All four models presented very consistent predictive results in a straightforward manner, enabling clinicians to clearly observe the weights contributed by the included features in the model predictions. Individual predictors are greatly influenced by subjective factors; for example, the oral dose reported by patients is subjective and may not be very accurate, and the amount vomited, the timing of gastric lavage, etc., affect the actual amount absorbed. Most earlier studies included only the patient's clinical test indicators and not their vital signs. This study combined objective indicators and patient status to objectively and intuitively evaluate the prognosis of patients with acute DQ poisoning.

Limitations
The sample size was small, which may have led to bias. In the future, we hope to continue to expand the sample size, summarize previous research experience, and strengthen the cooperation between basic and clinical studies to carry out high-quality clinical research for further demonstration. This research was based on a retrospective analysis; data were acquired from two distinct medical facilities, but due to limited data availability, the samples could not be divided to provide a separate external testing group. Consequently, external validation is necessary to further evaluate the performance of our models.

Conclusion
Our study indicates that machine learning can accurately assess the risk of death in patients with acute DQ poisoning. Combining machine learning with SHAP provides clear explanations for individualized risk prediction, enabling physicians to intuitively understand the impact of key features in the model.

Source of data
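This was a retrospective multicentre study, and the study design followed the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) reporting guidelines.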

Study population and definition of outcome
The inclusion criteria were as follows: patients admitted within 24 h of deliberate oral intake of DQ, patients aged > 14 years, and patients who had not undergone haemoperfusion before presentation. Patients who had severe chronic comorbidities, including symptomatic heart failure, decompensated liver cirrhosis, chronic obstructive pulmonary disease, or chronic kidney disease, or who received dialysis treatment before admission were excluded. In-hospital death was considered the endpoint, and the patients were categorized into a survival group and a nonsurvival group. In addition to an evaluation of each patient's main complaint, the diagnosis of DQ poisoning was confirmed by urine colorimetric analysis, and patients with PQ intoxication or mixed intoxication with PQ were excluded. In emergency situations, when DQ poisoning is suspected, a rapid and simple colorimetric test can be performed on urine by adding sodium bicarbonate or hydroxide, followed by sodium dithionite powder, which results in a green colour in the presence of DQ [4].

Feature selection and data preprocessing
The following data of all patients were recorded in the medical record system: (a) demographic parameters, such as age and sex; (b) the estimated DQ intake dose and whether haemoperfusion was performed; (c) vital signs, including the shock index (pulse/systolic blood pressure) and oxygen saturation, which were recorded upon first admission; and (d) blood biochemical indicators, including white blood cell (WBC) count, haemoglobin (Hb), platelet (PLT), alanine aminotransferase (ALT), total bilirubin (TBil), direct bilirubin (DBil), albumin (ALB), potassium (K+), blood urea nitrogen (BUN), creatinine (Cr), glucose (Glu), troponin I (TnI), brain natriuretic peptide (BNP), pH, partial pressure of oxygen (PaO2), partial pressure of carbon dioxide (PaCO2), and lactic acid, which were measured at the first admission. To improve the accuracy of the model, we used a normalization method to scale all the variables and map the data to the [0,1] interval. There were few missing values and outliers in this study. To preserve modelling accuracy, records with missing or extreme values were deleted rather than imputed, as shown in Fig. 4.
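The preprocessing described above (shock index, deletion of incomplete records, and scaling to the [0,1] interval) can be sketched as follows. This is an illustrative sketch rather than the authors' pipeline: the field names and patient values are hypothetical, and min-max scaling is assumed as the normalization method because the text specifies only the target interval.

```python
def shock_index(pulse, systolic_bp):
    """Shock index as defined in the text: pulse / systolic blood pressure."""
    return pulse / systolic_bp

def drop_incomplete(rows):
    """Complete-case analysis: discard any record with a missing (None) value."""
    return [r for r in rows if all(v is not None for v in r.values())]

def min_max_scale(rows):
    """Map every feature to [0, 1] using its observed minimum and maximum."""
    keys = list(rows[0].keys())
    lo = {k: min(r[k] for r in rows) for k in keys}
    hi = {k: max(r[k] for r in rows) for k in keys}
    return [
        {
            # A constant column has no spread; map it to 0.0 instead of
            # dividing by zero.
            k: (r[k] - lo[k]) / (hi[k] - lo[k]) if hi[k] > lo[k] else 0.0
            for k in keys
        }
        for r in rows
    ]

# Hypothetical records; the values are invented for illustration only.
patients = [
    {"cr": 60.0, "paco2": 38.0, "lactate": 1.2},
    {"cr": 310.0, "paco2": 24.0, "lactate": 6.5},
    {"cr": 120.0, "paco2": None, "lactate": 2.0},  # dropped: missing PaCO2
    {"cr": 185.0, "paco2": 31.0, "lactate": 3.85},
]
complete = drop_incomplete(patients)
scaled = min_max_scale(complete)
```

The text does not say whether the scaling bounds were derived from the training set alone; fitting them on the training set and reusing them on the test set is the usual way to avoid information leakage.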

Statistical analysis
Continuous variables are presented herein as the means (SDs) or medians (IQRs). Student's t test or the Mann-Whitney U test was used for comparisons, as appropriate. Categorical variables are presented as numbers (percentages) and were compared with the χ² test.
Four machine learning methods, namely, logistic regression, random forest, support vector machine (SVM) and gradient boosting, were employed for model construction. The samples were randomized into a training set (80%) and a test set (20%). Subsequently, the performance of each model was validated and compared using the test set. In our study, the model with the highest area under the curve (AUC) of the receiver operating characteristic (ROC) curve was selected as the optimal model. The 95% confidence interval (CI) for the AUC was calculated using the bootstrap method (1000 iterations). Next, calibration curves were plotted to assess the calibration of the four models, accompanied by the Hosmer-Lemeshow test. We calculated the F1 score, accuracy, Matthews correlation coefficient (MCC), precision, recall and Brier score. The net benefit of patients was evaluated through clinical decision curve analysis (DCA). SHAP was used to explain model features.
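The 8:2 random split and the bootstrap confidence interval for the AUC can be sketched as below. This is a minimal illustration under assumptions that the text does not specify: a simple unstratified shuffle, a percentile bootstrap, fixed random seeds, and invented scores. The AUC is computed through its rank interpretation, the probability that a randomly chosen nonsurvivor receives a higher score than a randomly chosen survivor.

```python
import random

def train_test_split(items, test_fraction=0.2, seed=0):
    """Shuffle and split into (train, test); an 8:2 split by default."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def auc(y_true, y_score):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(y_true, y_score, n_iter=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC, resampling cases with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_iter:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample holds a single class, so the AUC is undefined
        stats.append(auc(ys, [y_score[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_iter)]
    upper = stats[int((1 - alpha / 2) * n_iter) - 1]
    return lower, upper

# 201 patient indices split 8:2, matching the cohort size in the text.
train_ids, test_ids = train_test_split(range(201))

# Invented outcomes and scores, for illustration only.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
point_estimate = auc(y_true, y_score)
ci_low, ci_high = bootstrap_auc_ci(y_true, y_score)
```

With a test set of about 40 patients, bootstrap intervals of this kind tend to be wide, which is worth keeping in mind when comparing AUCs that differ by a few hundredths.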

Figure 1. The performance of the models. SVM, support vector machine. (A) The receiver operating characteristic curves of the four models. (B) The calibration curves of the models. (C) The decision curve analyses of the four models.

Table 2. Performance of the four models. AUC, area under the curve; MCC, Matthews correlation coefficient; SVM, support vector machine.
Scientific Reports | (2024) 14:16101 | https://doi.org/10.1038/s41598-024-67257-6

The study design followed the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) reporting guidelines. From February 2018 to August 2023, 201 consecutive patients with deliberate oral DQ poisoning were retrospectively reviewed; these included 93 patients from the emergency department of the First Hospital and 108 patients from the emergency department of Shengjing Hospital of China Medical University. The study protocol was approved by the Ethics Committee of the First Hospital of China Medical University (approval no. 2023[330]). All the data were analysed anonymously, and the need to obtain informed consent from the patients was waived.